ABSTRACT

We create mock pencil-beam redshift surveys from very large cosmological N-body
simulations of two Cold Dark Matter cosmogonies, an Einstein-de Sitter model
(τCDM) and a flat model with Ω₀ = 0.3 and a cosmological constant (ΛCDM). We use
these to assess the significance of the apparent periodicity discovered by Broadhurst
et al. (1990). Simulation particles are tagged as 'galaxies' so as to reproduce observed
present-day correlations. They are then identified along the past light-cones of
hypothetical observers to create mock catalogues with the geometry and the distance
distribution of the Broadhurst et al. data. We produce 1936 (2625) quasi-independent
catalogues from our τCDM (ΛCDM) simulation. A couple of large clumps in a catalogue
can produce a high peak at low wavenumbers in the corresponding one-dimensional
power spectrum, without any apparent large-scale periodicity in the original redshift
histogram. Although the simulated redshift histograms frequently display regularly
spaced clumps, the spacing of these clumps varies between catalogues and there is
no 'preferred' period over our many realisations. We find only a 0.72 (0.49) per cent
chance that the highest peak in the power spectrum of a τCDM (ΛCDM) catalogue
has a peak-to-noise ratio higher than that in the Broadhurst et al. data. None of the
simulated catalogues with such high peaks shows coherently spaced clumps with a
significance as high as that of the real data. We conclude that in CDM universes,
the regularity on a scale of ~130 h⁻¹ Mpc observed by Broadhurst et al. has a priori
probability well below 10⁻³.

Key words: cosmology: theory - large-scale structure of the Universe -
galaxies: clustering



1 INTRODUCTION 

The redshift distribution of galaxies in the pencil-beam sur- 
vey of Broadhurst et al. (1990, hereafter BEKS) displayed a 
striking periodicity on a scale of 128 h⁻¹ Mpc. This result has
attracted a good deal of interest over the subsequent decade, 
and the significance and nature of periodicity or regularity 
in the distribution of galaxies has remained the subject of 
a stimulating debate in both observational and theoretical 
cosmology. Although a number of studies have been devoted 
to the BEKS pencil-beam survey and other similar surveys, 
several fundamental questions remain unanswered. 



From the theoretical viewpoint, it is important to de- 
cide whether such apparently periodic galaxy distributions 
can occur with reasonable probability in a Cold Dark Matter 
universe, or require physics beyond the standard paradigm. 
Performing large simulations can directly address this ques- 
tion. The first simulation specifically designed for pencil- 
beam comparisons was that of Park and Gott (1991, here- 
after PG). Their rod-shaped CDM simulation allowed them 
to create twelve quasi-independent mock pencil-beam sur- 
veys similar in length to that of BEKS. One of their sam- 
ples appeared 'more periodic' than the BEKS data according 
to the particular statistical test they used for comparison.
Other authors (Kurki-Suonio et al.; Pierre 1990; Coles 1990; 
van de Weygaert 1991; SubbaRao & Szalay 1992) have used 
purely geometrical models such as cubic lattices and Voronoi 
foams to explore the implications of apparent regularities 
similar to those found by BEKS. In particular, SubbaRao 
and Szalay (1992) presented a sequence of Monte Carlo sim- 
ulations of surveys of Voronoi foams, showing that such a 
model can successfully reproduce the data as judged by a 
variety of statistical measures, for example, the heights, po- 
sitions and signal-to-noise ratios of the highest peaks in the 
power spectra. Kaiser & Peacock (1991) argued that the 
highest such peaks in the BEKS data are not sufficiently 
significant to be unexpected in a CDM universe, but did not 
support this conclusion with detailed simulations. Dekel et 
al. (1992) introduced other statistics, more similar to those 
of PG, and again concluded that the apparent periodicity
seen in the real data is not particularly unlikely in any of 
the toy models they used for comparison. Their models include
Gaussian models with an extreme initial power spectrum
with power only on scales ~100 h⁻¹ Mpc. They found
regular 'galaxy' distributions a few per cent of the time and 
concluded that the BEKS data do not rule out all Gaus- 
sian models. However, these theoretical studies did not give 
any clear answer to the question posed above: is the BEKS 
regularity compatible with the standard CDM paradigm? 
We attempt to answer this below using versions of all the 
statistical tools developed in earlier papers. 

There have been several interesting observational developments
after BEKS. Willmer et al. (1994) found that, if
the original BEKS deep survey at the North Galactic Pole 
had been carried out 1 degree or more to the west, many 
of the peaks would have been missed. On the other hand, 
Koo et al. (1993) added new data from a wider survey to 
the original BEKS data and found the highest peak in the 
power spectrum to be further enhanced. They also analysed 
another set of deep pencil-beam surveys and found a peak 
of weaker significance on the same scale, 128 h⁻¹ Mpc. This
raises another question: is 128 h⁻¹ Mpc a preferred length
scale for the galaxy distribution? Further support for such 
a preferred scale has been presented by Tully et al. (1992),
Ettori et al. (1997) and Einasto et al. (1997). Thus one can 
wonder whether a single scale could be indicated with such 
apparent consistency within the CDM paradigm. 

With the important exception of the work of PG there 
has been surprisingly little comparison of the BEKS data 
with direct simulations of standard CDM cosmogonies. Even 
before the BEKS discovery, White et al. (1987) had shown
that pencil-beams drilled through periodic replications of 
their CDM simulations frequently showed a kind of 'picket 
fence' regularity in their redshift distribution. Frenk (1991) 
confirmed this result and concluded that regular patterns 
similar to that seen in the BEKS data are easy to find in 
their simulations. However, it is clearly dangerous to make 
use of periodic replications of a simulation when assessing 
the significance of apparent periodicities in the redshift dis- 
tribution. It is preferable to simulate a volume large enough 
to encompass the whole survey. Furthermore, since many 
independent artificial surveys are needed to establish that 
the real data are highly unlikely in the cosmogony simu- 
lated, the simulated volume must be fully three-dimensional 
(unlike that of PG) to allow the creation of many quasi-independent
lines-of-sight. A final consideration is that the
BEKS data reach to redshifts beyond 0.3, so that evolution 
of clustering along the survey may not be negligible. 

In this paper we investigate the distribution of 'galaxies' 
along the past light-cones of hypothetical observers. Particle 
positions and velocities on these light-cones were generated 
as output from the Hubble Volume Simulations (Evrard et 
al. 2000). These very large CDM N-body simulations were
recently performed by the Virgo consortium and each used
10⁹ particles to follow the evolution of the matter distribution
within cubic regions of an Ω = 1 τCDM (Ω = 0.3
ΛCDM) universe of side 2000 h⁻¹ Mpc (3000 h⁻¹ Mpc). Such
large volumes allow many independent light-cones to be gen- 
erated out to z ~ 1. The light-cone output automatically ac- 
counts for clustering evolution with redshift. The principal 
uncertainty lies in how to create a 'galaxy' distribution from 
the simulated mass distribution. We employ Lagrangian bias 
schemes similar to those of White et al. (1987) and Cole et 
al. (1998). Individual particles are tagged as galaxies with 
a probability which depends only on the smoothed initial 
overdensity field in their neighbourhood. The parameters of 
these schemes are adjusted so that the present-day corre- 
lations of the simulated galaxies match observation. Many 
quasi-independent mock pencil-beam surveys can then be 
created adopting the geometry and the galaxy selection 
probability with distance of the BEKS surveys. 

Our discussions focus primarily on the significance of 
the BEKS data in comparison with our CDM samples. We 
begin by following the methods used originally by BEKS, 
namely, redshift counts, pair separation distributions, and 
the one-dimensional power spectrum. Redshift counts are
translated into a distribution in physical distance assuming 
the same cosmological parameters as BEKS. For the one- 
dimensional power spectra, the height of the highest peak is 
the most important statistic. Szalay et al. (1991) show that 
the statistical significance of the highest peak of the BEKS 
data is at the 10⁻⁴ level, based on the formal probability for
the peak height. This calculation was disputed by Kaiser &
Peacock (1991) because of the difficulty in estimating the 
appropriate noise level. We calculate relative peak-to-noise 
ratios of the highest peaks in the power spectra in identical 
ways for real and simulated data and so can compare the 
two without needing to resolve this issue. We also apply two 
additional statistical tests for regularity, the Δ test of PG
and a 'supercluster' statistic designed by Dekel et al. (1992). 

Our paper is organised as follows. In section 2 we 
present details of the N-body simulations from which our
pencil-beam samples are drawn. In section 3 we explain our 
bias scheme. In section 4 we describe our mock pencil-beam 
surveys which mimic as closely as possible the actual ob- 
servations of BEKS. Our main results for power spectra are 
given in section 5. Results are given in section 6 for the 
Δ test, and in section 7 for the 'supercluster' statistic. We
present our conclusions in section 8. 



2 N-BODY SIMULATION

The simulation data we use are the so-called "light-cone out- 
puts" produced from the Hubble Volume simulations (de- 
tails are in Evrard et al. 2001). The basic simulation pa- 
rameters are tabulated in Table 1, where Lbox is the box 
Table 1. Parameters of the Hubble Volume Simulations



Model   Lbox     Ω     Λ     h     σ₈    Γ      m_p [M⊙/h]

τCDM    2000.0   1.0   0.0   0.5   0.6   0.21   2.22 × 10¹²

ΛCDM    3000.0   0.3   0.7   0.7   0.9   0.21   2.25 × 10¹²



3.1 Lagrangian Bias 

Cole et al. (1998) developed and tested a set of bias schemes
to extract 'galaxies' from N-body simulations. The procedure
we use for the first stage of our galaxy selection is similar
to their Model 1, but has a different functional form for
the probability function. Since we need a bias factor greater
than unity for the τCDM model and less than unity for the
ΛCDM model (a result of the differing mass correlations in
the two cases), 'galaxies' need to avoid regions of low initial
density in the τCDM simulation and to avoid regions of
high initial density in the ΛCDM simulation. We begin by
smoothing the density field at an early time with a Gaussian,
exp(−r²/2r₀²) with r₀ = 3 h⁻¹ Mpc, and assigning the corresponding
overdensity δ to each dark-matter particle. Then
a normalised overdensity ν = δ/σ_s is computed, where σ_s is
the root mean square value of the particle δ-values. Finally, we
define a probability function P(ν) which determines whether
a particle is tagged as a 'galaxy'. We randomly sample dark-matter
particles for tagging as galaxies based on this probability.
Once tagged as a galaxy in this way, particles remain
tagged throughout the simulation, and so become potentially
visible in our mock surveys whenever their world-line
crosses a light-cone.
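
The tagging step can be illustrated with a short Python sketch. It is not the code used for this work: the function names, the toy Gaussian overdensities and the unit normalisation of P(ν) are assumptions made for the example. Only the procedure itself (normalise δ by its rms, then accept each particle with probability P(ν)) and the threshold values quoted for the ΛCDM bias models are taken from the text.

```python
import math
import random


def normalised_overdensity(delta):
    """Return nu = delta / sigma_s, with sigma_s the rms of the particle delta values."""
    sigma_s = math.sqrt(sum(d * d for d in delta) / len(delta))
    return [d / sigma_s for d in delta]


def tag_galaxies(nu, prob, rng):
    """Tag particle i as a 'galaxy' with probability prob(nu[i]) (a value in [0, 1])."""
    return [rng.random() < prob(v) for v in nu]


def window_probability(nu_low, nu_high, p_const=1.0):
    """Hypothetical helper: constant probability inside (nu_low, nu_high), zero outside."""
    def prob(v):
        return p_const if nu_low < v < nu_high else 0.0
    return prob


# Toy data: Gaussian overdensities stand in for the smoothed initial field.
rng = random.Random(1)
delta = [rng.gauss(0.0, 0.3) for _ in range(20000)]
nu = normalised_overdensity(delta)

# A sharp upper cut-off (nu < 1.34) and a two-sided window (-0.7 < nu < 0.9).
tags_upper = tag_galaxies(nu, window_probability(-math.inf, 1.34), rng)
tags_window = tag_galaxies(nu, window_probability(-0.7, 0.9), rng)
frac_upper = sum(tags_upper) / len(nu)
frac_window = sum(tags_window) / len(nu)
```

Passing the probability function as an argument keeps the sampling step independent of the functional form chosen for P(ν), so a different form can be swapped in without touching the rest of the sketch.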

For all the bias models described below the 'galaxy-galaxy'
correlation function was calculated in real space
within a cubic box of side 200 h⁻¹ Mpc with the observer
at one corner. These correlation functions are shown in
Figure 2.

τCDM bias model t1: For the τCDM model, we
chose a simple power-law form in (ν − ν_c) for the
probability function P(ν). We impose a threshold at ν = ν_c ≈
−0.55 below which the probability is set to be zero. This
suppresses the formation of 'galaxies' in voids. These parameters
were determined by matching the present-day two-point
correlation function of the 'galaxies' to the observational
result for the APM survey (Baugh 1996; see Figure 2) on
length scales from 0.2 h⁻¹ Mpc to 20 h⁻¹ Mpc. We note here
that in our N-body simulations the gravitational softening
length is 0.1 h⁻¹ Mpc.

ΛCDM bias model L1: For the ΛCDM model, we
must 'anti-bias' because the predicted mass correlations on
small scales are substantially larger than observed galaxy
correlations (see, for example, Jenkins et al. 1998). We set a
sharp upper cut-off at ν_c ≈ 1.34, above which P(ν) is zero.
All particles below this threshold are equally likely to be
'galaxies' (P = const). Although this may seem unphysical,
more realistic modelling of galaxy formation in ΛCDM
models does indeed produce the anti-bias required for
consistency with observation, albeit through a more complex
interplay of statistical factors (Kauffmann et al. 1999;
Benson et al. 2000). We use a simpler scheme in order to produce
the desired two-point correlation function; on scales of
interest here, only a small anti-bias is necessary.

ΛCDM bias model L2: For comparison purposes, we
applied a second bias model to the ΛCDM simulation. We
set an additional lower threshold at ν_low ≈ −0.7 below which
we again set P = 0. Thus the probability takes a non-zero
(and constant) value only in the range ν_low < ν < ν_high,
where now ν_high ≈ 0.9. This model fits the observed
correlations of galaxies just as well as L1 but enhances the



size in h⁻¹ Mpc, Γ stands for the shape parameter of the initial
power spectrum and m_p is the mass per particle; other
notations are standard.

The light-cone outputs are created in the following way.
We define an observer at a specific point in the simulation
box at the final time. The position and velocity of each particle
are recorded whenever it crosses the past light-cone of
this observer, and these phase-space coordinates are accumulated
in a single data file. The evolution of clustering
with lookback time (distance from the observer) is automatically
included in such data. As we require mock pencil-beam
surveys which reach z ~ 0.5 (spanning ~1000 h⁻¹ Mpc
in physical scale), such light-cone output is both realistic
and desirable. We use stored data from two different light-cone
outputs for each cosmology. Each covers one octant of
a sphere, and they emanate in opposite directions from the
same point. Figure 1 illustrates this geometry. In each case
we use data out to a comoving distance of 1500 h⁻¹ Mpc,
corresponding to redshift 0.77 in the τCDM model and 0.58
in the ΛCDM model. For the τCDM case the total length
covered is larger than the side of the simulation box, but
this has a negligible effect on the mock BEKS surveys we
construct.
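
The correspondence between comoving distance and redshift quoted above can be checked with a few lines of Python. This is a minimal sketch, assuming a flat universe, H0 = 100 h km/s/Mpc and the density parameters of Table 1; the function name and the use of Simpson's rule are choices made for the example.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
HUBBLE_DISTANCE = C_KM_S / 100.0  # c / H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc


def comoving_distance(z, omega_m, omega_lambda, steps=1000):
    """Line-of-sight comoving distance in h^-1 Mpc for a flat universe.

    Integrates dz / E(z) with Simpson's rule, where
    E(z) = sqrt(omega_m (1 + z)^3 + omega_lambda).
    """
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_lambda)

    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return HUBBLE_DISTANCE * total * h / 3.0


# Both values come out close to 1500 h^-1 Mpc.
d_tau = comoving_distance(0.77, 1.0, 0.0)     # Einstein-de Sitter
d_lambda = comoving_distance(0.58, 0.3, 0.7)  # flat model with a cosmological constant
```

For the Einstein-de Sitter case the integral has the closed form 2(c/H0)[1 − (1 + z)^(−1/2)], which gives a quick check on the numerical result.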



3 GALAXY SELECTION 

To create realistic mock surveys we have to select particles 
as galaxies with the same distribution in depth as the real 
data and with an appropriately 'biased' distribution relative 
to the dark matter. We do this in two stages. First we identify
a biased subset of the particles chosen according to the
value of the smoothed linear mass overdensity at their position
at high redshift. The parameters defining this identification
are chosen so that the two-point correlation function
of the identified 'galaxies' at z = 0 matches the observed
correlation function of low redshift galaxies. For the τCDM
model, we are able to achieve this while retaining about two-thirds
of the simulation particles as 'galaxies'. The resulting
comoving 'galaxy' number density is 0.08 h³ Mpc⁻³. For the
ΛCDM model we get a number density of 'galaxies' in the
range 0.02 to 0.033 h³ Mpc⁻³ depending on the bias scheme.
This lower number density is due to the low number density 
of the dark-matter particles in this model. The second stage 
is to mimic the effect of the apparent magnitude limits of the 
real galaxy surveys by including 'galaxy' particles into the 
final mock catalogues with a probability which depends on 
distance from the observer. Since this stage is independent 
of the first, we are effectively assuming that the clustering 
of galaxies is independent of their luminosity. Our radial se- 
lection function is based on those directly estimated for the 
BEKS surveys. 
emptiness of voids.

Figure 3 illustrates bias effects by comparing the distribution
of dark-matter particles and of 'galaxies' in a thin
slice through part of the simulation box at z = 0. For model
L1 the effect is difficult to detect visually, whereas the effect
of the lower cut-off in model L2 is obvious. Similarly,
for the τCDM model, underdense regions (voids) are clearly
accentuated in the 'galaxy' distribution relative to the dark
matter distribution. Cole et al. (1998) show similar plots to
demonstrate how strong bias in high density model universes
maps underdense regions in the mass distribution onto voids
in the galaxy distribution. Such contrasted voids are generally
seen in strongly biased models regardless of the functional
form of P(ν).

We can quantitatively study the bias in our models by
measuring the nonlinear biasing parameters introduced by
Dekel & Lahav (1999) (see also Sigad et al. 2000; Somerville
et al. 2000). We compute the slope $\hat{b}$ and nonlinearity $\tilde{b}$
following the procedure described in Somerville et al. (2000).
Figure 4 shows the biasing relation between the 'galaxy' density
field δ_g and the dark matter density field δ_d, smoothed
with a top-hat filter of scale 8 h⁻¹ Mpc. For each of our bias
models, the mean biasing function b(δ_d), and its moments

$$\hat{b} \equiv \frac{\langle b(\delta_d)\,\delta_d^2 \rangle}{\sigma_d^2}, \qquad \tilde{b}^2 \equiv \frac{\langle b^2(\delta_d)\,\delta_d^2 \rangle}{\sigma_d^2}, \tag{1}$$

are given in Figure 4. In the above expression we have used
σ_d ≡ ⟨δ_d²⟩^(1/2) for the standard deviation. Strong biasing in t1
and anti-biasing (for 1 + δ > 0) in L1 and L2 are clearly
seen, and reflected in the values for the effective slope: $\hat{b}$ =
1.44 for t1, 0.84 for L1 and 0.90 for L2.
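
As a rough illustration of how the slope and nonlinearity of Dekel & Lahav (1999) can be estimated from paired density samples, the Python sketch below bins the dark-matter overdensity and uses the conditional mean of the 'galaxy' overdensity in each bin. It is not the procedure of Somerville et al. (2000): the equal-count binning, the function name and the toy linear relation are all assumptions made for the example.

```python
import random


def biasing_moments(delta_g, delta_d, n_bins=50):
    """Estimate the slope b_hat and nonlinearity b_tilde from paired samples.

    The conditional mean <delta_g | delta_d> = b(delta_d) * delta_d is estimated
    in equal-count bins of delta_d, so that
      b_hat     = <b delta_d^2> / sigma_d^2   = <delta_g delta_d> / sigma_d^2
      b_tilde^2 = <b^2 delta_d^2> / sigma_d^2 = <<delta_g | delta_d>^2> / sigma_d^2
    """
    n = len(delta_d)
    sigma2 = sum(d * d for d in delta_d) / n
    b_hat = sum(g * d for g, d in zip(delta_g, delta_d)) / n / sigma2

    pairs = sorted(zip(delta_d, delta_g))
    size = max(1, n // n_bins)
    cond_sq = 0.0
    for start in range(0, n, size):
        chunk = pairs[start:start + size]
        mean_g = sum(g for _, g in chunk) / len(chunk)
        cond_sq += len(chunk) * mean_g ** 2
    b_tilde = (cond_sq / n / sigma2) ** 0.5
    return b_hat, b_tilde


# Toy data: a linear relation with slope 1.5 plus scatter (illustrative only).
rng = random.Random(7)
delta_d = [rng.gauss(0.0, 0.5) for _ in range(20000)]
delta_g = [1.5 * d + rng.gauss(0.0, 0.2) for d in delta_d]
b_hat, b_tilde = biasing_moments(delta_g, delta_d)
```

For a purely linear relation the two moments coincide, so the ratio of b_tilde to b_hat departs from unity only when the mean biasing function is nonlinear.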



4 SURVEY STRATEGY 
4.1 Geometry 

We construct artificial surveys with a geometry very sim- 
ilar to that of the data analysed by BEKS. This consists 
of four surveys - a deep and a shallow survey near each 
Galactic Pole. The northern deep survey lies within a cone 
of 40-arcmin diameter about the pole and is made up of a 
set of roughly circular patches each 5-arcmin in diameter. 
About 10 small patches were surveyed but not all were com- 
pleted by the time of writing so that the exact number of 
patches used in BEKS is unclear. For our artificial surveys 
we choose 9 circular patches within the 40-arcmin diameter 
cone, each of diameter 5 arcminutes. We place these irreg- 
ularly and ensure no overlaps between them. For model L2 
the number density of 'galaxies' is too small to match the 
observations, so we increased the diameter of our patches 
to 7-arcmin. Although this widening results in a slight in- 
crease in the effective survey volume, the small patches still 
lie well within the larger cone of 40-arcmin diameter. The 
volume increase is compensated in the radial selection we 
describe below, so that the resulting 'galaxy' distribution is
consistent with the desired distribution given in BEKS. For
the deep-narrow pencil-beams, the transverse length scale
(the cone diameter is ~4 h⁻¹ Mpc at z = 0.2, where the radial
selection function takes its maximum value) is much smaller
than the 100 h⁻¹ Mpc scale we address, so the increase in the
patch width does not affect our results. In all cases the redshift
counts in all patches were binned together to create a
single deep survey. The northern shallow survey has a sim- 
pler geometry. A square area of about 14 square degrees is 
selected near the Galactic Pole, but with its centre offset 
by 7 degrees. The magnitude limit of the shallow survey is 
about 5 magnitudes brighter than that of the deep survey. 
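
The irregular, non-overlapping placement of the small patches in the northern deep survey can be sketched with simple rejection sampling. This is a hypothetical illustration in the flat-sky approximation, not the procedure actually used; the function name, the random seed and the retry limit are assumptions.

```python
import math
import random


def place_patches(n_patches, patch_diam, cone_diam, rng, max_tries=10000):
    """Place non-overlapping circular patches inside a circular cone footprint.

    Works in the small-angle (flat-sky) approximation; all lengths are in arcmin.
    Returns a list of (x, y) patch centres relative to the cone axis.
    """
    r_patch = patch_diam / 2.0
    r_max = cone_diam / 2.0 - r_patch  # keep each patch fully inside the cone
    centres = []
    for _ in range(max_tries):
        if len(centres) == n_patches:
            break
        # Uniform point in a disc of radius r_max.
        r = r_max * math.sqrt(rng.random())
        phi = 2.0 * math.pi * rng.random()
        x, y = r * math.cos(phi), r * math.sin(phi)
        # Reject the candidate if it overlaps a patch that is already placed.
        if all(math.hypot(x - cx, y - cy) >= patch_diam for cx, cy in centres):
            centres.append((x, y))
    if len(centres) < n_patches:
        raise RuntimeError("could not place all patches without overlap")
    return centres


# Nine 5-arcmin patches inside the 40-arcmin cone.
centres = place_patches(9, 5.0, 40.0, random.Random(42))
```

Calling the same function with a patch diameter of 7 arcmin gives the widened layout described for model L2.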

Towards the South Galactic Pole both surveys are cen- 
tred very close to the pole itself. The deep survey is confined 
within a cone of 20-arcmin diameter, while the shallow sur- 
vey covers an area of 14 square degrees and has a magnitude 
limit about 4 magnitudes brighter. 

When making an artificial survey we choose a random 
direction in the simulation as the Galactic polar axis and 
then define all areas on the artificial sky with reference to 
this direction. The light-cone outputs from our Hubble Volume
simulations cover enough 'sky' to allow us to make well
over 1000 near-independent artificial surveys. 

4.2 Radial selection 

'Galaxies' projected in our survey regions are assigned 
weights for selection depending on their distances. We use 
the estimated galaxy distributions given in Figure 1 of BEKS 
to define the relevant selection functions for each survey. The 
data are read off in redshift bins of width Δz = 0.005 for the
deep surveys and Δz = 0.001 for the shallow surveys. We
then derive a smoothed model galaxy distribution dN/dz 
for each survey and compute the corresponding comoving 
number densities from the number counts and the volume 
elements given by the survey geometry and the assumed cos- 
mology. As explained above, we had to increase the size of 
the patches in the northern deep survey in case L2 in or- 
der to get the correct mean counts. This is easily accounted 
for by appropriate renormalisation. These radial selection 
functions are used as sampling probabilities to determine 
whether a particular 'galaxy' is included in a catalogue or 
not. We normalise our probabilities by matching the mean 
number of 'galaxies' in each survey to the number of galaxies 
in BEKS data. This matching is done for the 4 surveys in- 
dependently. For consistency, the normalisation coefficients 
obtained are then kept constant when constructing all real- 
isations for a particular model. 
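
A minimal sketch of the radial selection step follows, assuming the survey has been divided into redshift bins. The function names and the toy counts are hypothetical; the only ingredient taken from the text is that each 'galaxy' is kept with a distance-dependent probability normalised so that the mean mock counts match the target distribution.

```python
import random


def selection_probabilities(target_counts, available_counts):
    """Per-bin sampling probability that reproduces the target mean counts.

    target_counts[i]    : smoothed model count wanted in redshift bin i
    available_counts[i] : mean number of biased 'galaxies' found in bin i
    """
    probs = []
    for want, have in zip(target_counts, available_counts):
        p = want / have if have > 0 else 0.0
        probs.append(min(1.0, p))  # cannot select more than is available
    return probs


def select_galaxies(bin_index, probs, rng):
    """Keep galaxy j with the probability of the redshift bin it falls in."""
    return [rng.random() < probs[b] for b in bin_index]


# Toy example with three redshift bins (all numbers are illustrative).
target = [2.0, 10.0, 4.0]
available = [40.0, 50.0, 400.0]
probs = selection_probabilities(target, available)
expected_total = sum(p * n for p, n in zip(probs, available))
```

Because the probabilities are fixed once from ensemble means, individual realisations still fluctuate in their total counts, as they would in a real survey.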

4.3 Peculiar velocities 

The peculiar velocities of 'galaxies' must be taken into ac- 
count to create realistic mock redshift surveys. We simply 
assign our 'galaxies' the peculiar velocities of their corre- 
sponding dark-matter particles. Thus, while the spatial dis- 
tribution of 'galaxies' is biased, there is no additional bias 
associated with their peculiar velocities. On small scales
peculiar velocities lead to 'finger-of-God' effects which suppress
power in the apparent spatial distribution at high wavenumber.
In our mock catalogues the root mean square values of
the 'galaxy' line-of-sight peculiar velocities are 342 km/sec
in t1 and 358 km/sec in L1 and L2. The redshift bin width
shown in BEKS is Δz = 0.005 for the deep surveys, which
translates to ~1500 km/sec in recession velocity. Therefore, the
assigned peculiar velocities of 'galaxies' do not smear out the
true width of clumps in one-dimensional distributions, while
they reflect properly the underlying velocity field. At the
intermediate and small wavenumbers corresponding to the 
linear and quasi-linear regime, they increase the apparent 
power (e.g. Kaiser and Peacock 1991). These line-of-sight 
distortions reflect the enhanced contrast produced by infall 
onto superclusters. It is thus important to include the pecu- 
liar velocities when comparing simulations to the structures 
seen in the BEKS data. 
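
The comparison between the bin width and the peculiar velocities can be made concrete with a short Python sketch. The non-relativistic Doppler factor and the function names are assumptions made for the example; the 350 km/sec velocity is simply a round number of the same order as the rms values quoted above.

```python
C_KM_S = 299792.458  # speed of light in km/s


def observed_redshift(z_cosmological, v_los_km_s):
    """Redshift seen by the observer for a line-of-sight peculiar velocity.

    Uses the non-relativistic Doppler factor: 1 + z_obs = (1 + z_cos)(1 + v/c).
    """
    return (1.0 + z_cosmological) * (1.0 + v_los_km_s / C_KM_S) - 1.0


def bin_width_in_velocity(delta_z):
    """Recession-velocity width c * delta_z of a redshift bin, in km/s."""
    return C_KM_S * delta_z


width = bin_width_in_velocity(0.005)         # about 1500 km/s
shift = observed_redshift(0.2, 350.0) - 0.2  # redshift offset for ~350 km/s
```

The redshift offset produced by a typical peculiar velocity is a fraction of one deep-survey bin, which is why the clumps are not smeared out.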



5 GALAXY DISTRIBUTION AND POWER 
SPECTRUM ANALYSIS 

The geometry of our light-cone datasets allows the axis of 
our artificial surveys to lie anywhere within one octant of
the 'sky' (see Figure 1). We construct an ensemble of mock
surveys with axes distributed uniformly across this octant 
in such a way that the areas covered by the corresponding 
deep surveys do not overlap. We end up with 1936 quasi-independent
deep surveys for our τCDM model. For the
ΛCDM case an additional pair of light-cone outputs was
stored, allowing us to construct 2625 disjoint deep surveys. 
To these deep pencil beams we add shallow surveys, whose 
volumes then have slight overlaps with those of neighbouring 
surveys. In practice, however, rather few 'galaxies' appear in 
more than one of our mock catalogues. 

A series of plots of the redshift distribution and de- 
rived statistics are given for selected 'mock BEKS surveys' 
in Figure 6, which consists of 6 sets of 3 figures. These can
be compared with Figure 5, which shows the real
BEKS data, reproduced here for comparison with
our simulation results. We read these data from Figure 2
in Szalay et al. (1991) where they are given as a histogram
of bin width 10 h⁻¹ Mpc; when necessary for our analysis,
we assume that the galaxies in each bin are uniformly dis- 
tributed across the bin. The particular mock surveys in the 
following 6 plots were chosen to illustrate a variety of points 
made in the following sections. 

5.1 One-dimensional distribution 

In each set of plots in Figure 6, the top panel shows the dis- 
tance histogram of 'galaxies' in the combined deep and shal- 
low surveys. The total number of galaxies in these combined 
surveys is given in this panel. We have assumed an Einstein- 
de Sitter universe for both of our models when converting 
redshift to physical distance, although the actual value of 
Ω is 0.3 in the ΛCDM case. This apparent inconsistency is
needed to allow a direct comparison with the analysis in
BEKS where Ω = 1 was also assumed. Szalay et al. (1991)
noted that using low values of Ω to convert redshift to distance
reduces the significance of the apparent periodicity in
the BEKS data. Throughout this paper we assume Ω = 1
for this conversion. 

In Figure 6, panel (a) shows one of the best catalogues in 
our t1 ensemble in that it gives the impression that 'galaxies'
are distributed periodically and, in addition, the 1-D power 
spectrum shows a sharp and high peak. Panel (b) shows the 
same features but with a smaller characteristic spacing. In 
each of these plots we mark the best periodic representation 
of the data in the same way as BEKS. We determine the 
characteristic spacing from the position of the highest peak
in the power spectrum, and we adjust the phase to match
the positions of as many big clumps as possible. The char- 
acteristic spacing is indicated by the vertical dashed lines in 
the top panels in Figure 6. Panel (c) shows a good example 
whose power spectrum has a very high peak while the actual 
distance distribution does not show a periodic feature (discussed
in section 5.3). Panels (d) and (e) show the best examples
from our models L1 and L2, respectively, which give
a good visual impression that 'galaxies' are spaced regularly.
Finally panel (f) shows an example from model L2, which
has a large characteristic length scale of ~200 h⁻¹ Mpc.

5.2 Pairwise separation histograms 

From the apparent distance distributions of the 'galaxies', 
it is easy to produce histograms of pairwise distance differ- 
ences which can be used to search for characteristic scales 
in the structure within our pencil-beam surveys. Such pair 
counts are shown in the middle plot of each panel in Fig- 
ure 6. These counts typically display a series of peaks and 
valleys which are particularly prominent in panels (a), (e) 
and (f), and, as noted by BEKS themselves, in the original 
BEKS data. Notice that these peaks appear regularly spaced 
as indicated by the dashed lines in these panels. The con- 
trast between peaks and valleys can be used as a measure of 
the strength of the regularity. For the BEKS data the ratio
of the height of the first peak to that of the first valley is about
3, while the corresponding numbers are 2.4, 2.2
and 3.4 in panels (a), (e), and (f), respectively. Many of our 
artificial samples show a more complex behaviour, however. 
In panel (b) there is a deep valley at 150 h⁻¹ Mpc followed by
a high peak at 200 h⁻¹ Mpc; the contrast is a factor of 5.3 despite
this uneven spacing. A robust and intuitive definition
of contrast is difficult to find. An alternation of small-scale 
peaks and valleys can coexist with apparently significant 
larger scale variations as is clearly seen in panel (c). If we 
focus specifically on the strongest peaks and valleys, their 
ratio, and indeed even their identification can depend on the 
specific binning chosen for the histograms. Because of these 
ambiguities we will not use these distributions further for 
quantitative analysis in this paper. 
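
For reference, a pairwise separation histogram of the kind discussed in this subsection can be built as follows. This is a minimal sketch: the function name, the bin width and the toy clump positions (three small clumps spaced 128 h⁻¹ Mpc apart) are assumptions made for the example.

```python
def pair_separation_histogram(distances, bin_width, max_sep):
    """Histogram of |r_i - r_j| over all distinct pairs along the line of sight."""
    n_bins = int(max_sep / bin_width)
    counts = [0] * n_bins
    for i in range(len(distances)):
        for j in range(i + 1, len(distances)):
            b = int(abs(distances[i] - distances[j]) / bin_width)
            if b < n_bins:
                counts[b] += 1
    return counts


# Toy sample: three clumps of three 'galaxies' each, spaced 128 h^-1 Mpc apart.
toy = [centre + offset for centre in (100, 228, 356) for offset in (-2, 0, 2)]
counts = pair_separation_histogram(toy, bin_width=10, max_sep=300)
```

In this toy sample the pairs between neighbouring clumps are split between the 120-130 and 130-140 bins, a small-scale illustration of how the apparent peak heights depend on the binning chosen.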

5.3 Power spectra 

In order to compare our results directly with BEKS we calculate
one-dimensional power spectra for our samples using
the method described in Szalay et al. (1991). Each galaxy
is represented by a Dirac delta-function at the distance inferred
from its redshift (including its peculiar velocity). The
power in each Fourier component is then

$$f_k = \frac{1}{N_g} \sum_{n} \exp(2\pi i k r_n), \tag{2}$$

$$P_k = |f_k|^2, \tag{3}$$

where r_n is the distance of galaxy n and N_g is the total number
of galaxies in the sample. The
power spectra calculated in this way for our various samples
are shown in the bottom plots of each panel in Figure
6. Our units are such that the wavelength corresponding
to wavenumber k is 1000/k h⁻¹ Mpc. In panels (a) and (b) the
visual impression of the separation of clumps is consistent
with the wavelength inferred from the power spectra. The
highest peak is at k = 8.0 in panel (a) and k = 16.0 in panel
(b), giving wavelengths of 125 h⁻¹ Mpc and 62.5 h⁻¹ Mpc,
respectively. As we shall see, there is no unique length scale
inferred from the power spectra.
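
The power spectrum estimator can be sketched in a few lines of Python. The function name and the explicit `length_unit` argument are assumptions: the latter simply encodes the convention that mode k has wavelength 1000/k h⁻¹ Mpc. The toy sample of eight equally spaced points is illustrative only.

```python
import cmath
import math


def power_spectrum_1d(distances, k_values, length_unit=1000.0):
    """One-dimensional power P_k = |f_k|^2 of a set of point 'galaxies'.

    f_k = (1 / N_g) * sum_n exp(2 pi i k r_n / length_unit), so that the
    wavelength of mode k is length_unit / k (here 1000 / k in h^-1 Mpc).
    """
    n_g = len(distances)
    power = []
    for k in k_values:
        f_k = sum(cmath.exp(2j * math.pi * k * r / length_unit) for r in distances) / n_g
        power.append(abs(f_k) ** 2)
    return power


# Toy sample: eight 'galaxies' spaced exactly 125 h^-1 Mpc apart.
toy = [125.0 * n for n in range(8)]
k_values = list(range(1, 13))
power = power_spectrum_1d(toy, k_values)
k_peak = k_values[power.index(max(power))]
```

With this normalisation a perfectly periodic sample gives unit power at its fundamental wavenumber, while an unclustered sample of N_g points has mean power 1/N_g.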

If a pencil-beam penetrates a rich cluster, an interesting 
feature can arise. For example, in panel (c) there is a single
large cluster at 600 h⁻¹ Mpc. Together with a few other
clumps of moderate size, it produces a very high peak in
the power spectrum without the distance distribution as a
whole giving a visual impression of regularity (cf. the top
plot of panel (c)). Many of our samples in both the τCDM
and ΛCDM ensembles show high peaks in the power spectra
with no apparent periodicity. Thus a very high peak in the 
power spectrum, particularly at low wave-number, is a poor 
indicator of the kind of regularity which is so striking in the 
original BEKS data. Interestingly, as Bahcall (1991) pointed 
out, if one of the BEKS survey beams passed near the cen- 
tre of a rich cluster, the galaxy count in the corresponding 
distance bin would have been much larger than the maxi- 
mum of 22 seen in the actual BEKS data (see also Willmer 
et al. 1994). (For comparison, the maximum bin count in 
the histogram of panel (a) is 23.) We note that the Poisson 
sampling noise in each power spectrum can be estimated as 
1/N_g. As a result, a big clump raises the statistical significance
of 'structure' both by enhancing the strength of peaks
and by lowering the estimated noise. 

In order to compare samples with different total numbers of 'galaxies', we calculate the signal-to-noise ratio of the highest peak in the power spectrum following the procedure of Szalay et al. (1991). We define the peak-to-noise ratio of a sample as X = (peak height)/(noise level), where the noise is estimated from the sum in quadrature of the Poisson sampling noise and the clustering noise,

$$f_0 = \frac{1}{N_g} + \frac{\xi_0}{M}, \qquad (3)$$

where, as before, N_g is the total number of galaxies in the sample, ξ_0 is the small-scale two-point correlation function averaged over a cell of depth 30 h^{-1} Mpc, and M is the number of cells along the survey axis. We use the value ξ_0/M = 1/80 as in Szalay et al. (1991). Although Szalay et al. derived this formula from a simple model with cylindrical geometry, they showed that this estimator agrees well with another internal estimator calculated from the cumulative distribution of power. To facilitate direct comparison with the earlier work we also use equation (3) to compute signal-to-noise ratios for the highest peaks in our samples. These S/N ratios are given in each of the power spectrum plots in Figure 6. Figure 7 shows both the differential and the cumulative distribution of peak-to-noise ratio in our mock surveys. The principal difference between the τCDM and ΛCDM ensembles lies in the position of the peak in the differential count. For both L1 and L2 the peak is at smaller X than in t1. This difference can be traced to the value of Ω we assume for analysis. Adopting Ω = 1 for converting redshift to physical distance causes the value of M, the number of cells of width 30 h^{-1} Mpc along the survey axis, to be underestimated for ΛCDM. Using the noise estimator (3) with this value of M then overestimates the noise levels for L1 and L2 (see SubbaRao and Szalay (1992) for discussion of a similar point).
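The peak-to-noise ratio can be sketched in a few lines. The snippet below is only an illustration, not the code used for the analysis: it reads the noise estimator (3) as 1/N_g + ξ_0/M with ξ_0/M = 1/80, and it assumes a one-dimensional power spectrum of the form P(k) = |(1/N) Σ exp(2πi k r_n / L)|^2 evaluated at integer k. That normalisation, the integer wavenumber grid, and the names `power_spectrum`, `noise_level` and `peak_to_noise` are assumptions introduced here.

```python
import cmath
import math


def power_spectrum(distances, length, k_max):
    """P(k) = |f_k|^2 for k = 1..k_max, with f_k = (1/N) sum_n exp(2 pi i k r_n / length).

    The normalisation and the integer k grid are assumptions of this sketch.
    """
    n = len(distances)
    spectrum = []
    for k in range(1, k_max + 1):
        f_k = sum(cmath.exp(2j * math.pi * k * r / length) for r in distances) / n
        spectrum.append(abs(f_k) ** 2)
    return spectrum


def noise_level(n_gal, xi0_over_m=1.0 / 80.0):
    """Poisson term 1/N_g plus clustering term xi_0/M (1/80 as quoted in the text)."""
    return 1.0 / n_gal + xi0_over_m


def peak_to_noise(distances, length, k_max):
    """Return (X, k_peak): height of the highest peak over the noise level, and its k."""
    spectrum = power_spectrum(distances, length, k_max)
    peak = max(spectrum)
    k_peak = spectrum.index(peak) + 1  # spectrum[0] corresponds to k = 1
    return peak / noise_level(len(distances)), k_peak
```

Because the Poisson term scales as 1/N_g, adding galaxies to a single clump lowers `noise_level` while raising the low-k power, which is the double effect of big clumps noted in the text.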



We plot in Figure 8 the wavenumber distribution of the peaks whose S/N ratios are higher than that of the original BEKS data (X = 11.8). We find that 14 samples satisfy this condition in t1, 7 in L1 and 13 in L2. The distributions of the peaks with X > 8.0 (the 'tails' of the number counts in Figure 7) are also shown in Figure 8. By checking the distance distributions we have found that highly significant peaks at k < 5 are almost always due to one or two strong clumps, as noted above. Very few catalogues give a high peak on scales similar to the BEKS data. It is interesting that the frequency of such catalogues is significantly higher in t1 than in L1 and L2. The difference is primarily due to the number density of rich clusters over the redshift range surveyed. Deep pencil-beams in our L1 and L2 models are more likely than those in t1 to hit a rich cluster at redshift ~ 0.3-0.5. High peaks in the power spectra then tend to appear at small wavenumbers in L1 and L2 for the reason explained above.

Overall, we conclude that although roughly half a per 
cent of our mock surveys give a power spectrum peak 
stronger than that of the BEKS data, very few of these 
actually correspond to redshift distributions with similar 
regularity and similar spacing of the spikes. We now study 
this further by considering two additional tests for regularity 
which have been used on the BEKS sample. 



6 PG Δ-TEST

In comparing their own simulation to the BEKS data, Park & Gott (1991) made use of a test specifically designed to probe the apparent "phase-coherence" of the series of redshift spikes. For each 'galaxy' they calculated the distance to the nearest tooth of a perfectly regular comb-like template. They then took the ratio of the mean of this distance to the separation of the teeth, and minimised the result over the period and phase of the template. Let us call the resulting statistic Δ. A distribution in which each galaxy is at some node of a regular grid then gives Δ = 0, while a uniform distribution in distance gives Δ = 0.25 in the large-sample limit. In our application of this test we restrict the range of possible periods to 50 - 500 Mpc. For the BEKS data we obtain Δ = 0.165 for a best period of 130 h^{-1} Mpc. Our value of Δ differs from that given by PG because they applied the test only to the deep surveys while we use the combined deep and shallow BEKS data. Among the 1936 samples in our t1 ensemble, 209 have lower values of Δ than the BEKS data; for the L1 and L2 ensembles the corresponding numbers are 134/2625 and 127/2625 respectively. According to this test, therefore, the observed sample appears only marginally more regular than expected in our CDM cosmologies.
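The comb-template statistic of Park & Gott described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the minimisation over period and phase is done here by a brute-force grid search, and the grid resolution (`n_periods`, `n_phases`) and the function names `delta_for_template` and `pg_delta` are assumptions introduced for this example. For a fixed template and a uniform distribution, the distance to the nearest tooth is uniform on [0, p/2], so its mean is p/4 and the statistic is 0.25, as stated in the text.

```python
def delta_for_template(distances, period, phase):
    """Mean distance to the nearest tooth of the comb, in units of the tooth spacing."""
    total = 0.0
    for r in distances:
        offset = (r - phase) % period          # position within one period
        total += min(offset, period - offset)  # distance to the nearest tooth
    return total / (len(distances) * period)


def pg_delta(distances, p_min=50.0, p_max=500.0, n_periods=451, n_phases=100):
    """Grid-search minimum of the statistic; returns (delta, period, phase).

    The search grid is an assumption of this sketch; the first minimum found is kept.
    """
    best = (float("inf"), None, None)
    for i in range(n_periods):
        period = p_min + (p_max - p_min) * i / (n_periods - 1)
        for j in range(n_phases):
            phase = period * j / n_phases
            d = delta_for_template(distances, period, phase)
            if d < best[0]:
                best = (d, period, phase)
    return best
```

Note that a perfectly periodic sample also scores zero at any sub-multiple of its true period that lies inside the search range, so the period returned by such a search should be inspected rather than taken at face value.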

Within each of our ensembles the significance of the regularity in the BEKS data appears somewhat higher than was estimated by PG. Their simulation ensemble was made up of only 12 mock catalogues, of which one had lower Δ than the BEKS deep data. The median Δ for these twelve was 0.1695, while we find medians of 0.176, 0.180 and 0.189 for the combined deep and shallow data in ensembles t1, L1 and L2 respectively. The difference from PG is probably small enough to be attributed to the small number of realisations in their ensemble. Figure 9 shows the distribution of periods for catalogues in each of our own ensembles with lower values of Δ than the BEKS data. It is interesting that relatively small periods are favoured and that there is no preference for values in the range 120 - 130 h^{-1} Mpc. Ettori et al. (1997) used a related test, the comb-template test (Duari et al. 1992), to analyse four pencil-beam surveys near the South Galactic Pole. They found a best period near 130 h^{-1} Mpc in two of these four directions, in apparent agreement with the BEKS result.

In summary, the difference in regularity between the BEKS sample and our CDM mock catalogues is less significant when measured by the Δ-test than when measured using the power spectrum test of the last section. Nevertheless, for periods near 125 h^{-1} Mpc there are few CDM samples which are more regular than the BEKS data. In addition, our result appears insensitive to the choice of biasing; we find essentially no difference between L1 and L2 in Figure 9. This is puzzling since Figure 3 shows clear differences in the emptiness of the voids in the two cases. Apparently the value of Δ is more sensitive to departures from regularity in the spacing of the walls than it is to the density contrast of the voids.



7 SUPERCLUSTER STATISTICS

Dekel et al. (1992) proposed an alternative technique for assessing apparent periodicity in data samples like that of BEKS. In this section we use the term 'supercluster' to refer to clumps in the one-dimensional redshift histograms derived from such pencil-beam surveys, even though these do not correspond precisely to the superclusters (or walls or filaments) seen in fully three-dimensional surveys. The method of Dekel et al. is based on the redshift distribution of supercluster centres rather than on that of individual galaxies. The first step is to correct the galaxy redshift histogram for the survey selection function. We do this by weighting each galaxy by the inverse of the selection function for the particular survey of which it is a part (North or South, shallow or deep). This reverses the procedure by which we created our mock catalogues from the simulations. To avoid overly large sampling noise where the selection function is small, we restrict our redshift histograms to z < 0.31 for the SGP survey and z < 0.5 for the NGP survey (see Dekel et al. 1992). We smooth these histograms with a Gaussian of variance l^2 and identify supercluster centres as local maxima of the result. (Note that, following Dekel et al., no threshold is applied.) We have tried smoothing lengths l between 20 and 40 h^{-1} Mpc, but find our results to be insensitive to the exact value within this range. In what follows we set l = 25 h^{-1} Mpc.

Given a distribution of the supercluster centres, the characteristic period is determined in the following way. As a first estimate we take the mean separation L_m between neighbouring superclusters. Next we apply the PG Δ-test for periods p ∈ [0.5 L_m, 2.0 L_m]. The value of p in this range which minimises Δ is taken as the characteristic period of the distribution. For this period we calculate the Rayleigh statistic R as follows (Dekel et al. 1992; Feller 1971). The positions of the supercluster centres are mapped onto a circle of circumference p. Consider the n unit vectors u_i which point from the centre of the circle towards each of the n superclusters. Denote their vector average by ⟨u⟩ and the modulus of ⟨u⟩ by V, and define R = 1 - V. For an exactly periodic distribution the unit vectors would all be identical, so that V = 1 and R = 0. For a distribution with no long-range phase coherence the directions of the unit vectors would be random, and so in the large-sample limit V ~ 0 and R ~ 1. Small values of R are thus expected for near-periodic distributions.

For the BEKS data, we find R = 0.33 for a period of 130 h^{-1} Mpc obtained as described above. Lower values of R are found for 46 samples in t1, for 66 samples in L1, and for 56 samples in L2. Thus, according to this test the supercluster distribution in the BEKS data is more periodic than the CDM models at the 2.4 per cent significance level for t1, the 2.5 per cent level for L1 and the 2.1 per cent level for L2. The period distribution of the samples with R < R(BEKS) is shown in Figure 10. Many of these low-R samples have periods in the range [100 h^{-1} Mpc, 140 h^{-1} Mpc]. Thus in CDM model universes it is common for supercluster spikes to have a typical separation similar to that seen in the BEKS data, and in a few per cent of cases the spikes are just as regularly spaced as in the real data.
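The supercluster-centre identification and the Rayleigh statistic of this section can be sketched as below. This is an illustrative reading of the procedure, not the original implementation: the histogram passed in is assumed to be already weighted by the inverse selection function, the Gaussian kernel is truncated at four smoothing lengths, and the helper names `smooth_gaussian`, `local_maxima` and `rayleigh_R` are hypothetical. The period is taken as an input here; in the text it comes from the comb-template test restricted to 0.5 to 2 times the mean separation of the centres.

```python
import math


def smooth_gaussian(counts, bin_width, length):
    """Smooth a histogram with a Gaussian of standard deviation `length`.

    The kernel is truncated at 4 * length (an assumption of this sketch).
    """
    half = int(math.ceil(4.0 * length / bin_width))
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (j * bin_width / length) ** 2) for j in offsets]
    norm = sum(kernel)
    smoothed = []
    for i in range(len(counts)):
        acc = 0.0
        for j, w in zip(offsets, kernel):
            if 0 <= i + j < len(counts):
                acc += w * counts[i + j]
        smoothed.append(acc / norm)
    return smoothed


def local_maxima(values, bin_width):
    """Bin-centre positions of local maxima; no threshold is applied."""
    return [(i + 0.5) * bin_width
            for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1]]


def rayleigh_R(centres, period):
    """R = 1 - V, with V the modulus of the mean unit vector on a circle of circumference `period`."""
    n = len(centres)
    mean_x = sum(math.cos(2.0 * math.pi * c / period) for c in centres) / n
    mean_y = sum(math.sin(2.0 * math.pi * c / period) for c in centres) / n
    return 1.0 - math.hypot(mean_x, mean_y)
```

With centres spaced exactly one period apart `rayleigh_R` returns a value close to 0, and with centres spread evenly around the circle it returns a value close to 1, matching the two limits described in the text.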



8 DISCUSSION AND CONCLUSION 

By creating a number of mock pencil-beam surveys we have compared the apparent periodicity in two CDM model universes with that observed in the data of Broadhurst et al. (1990). The power spectrum analysis alone shows that the BEKS data are significantly more periodic than the models at about the half per cent level, while the PG Δ-test shows less significance, about 10 per cent for t1 and 5 per cent for L1 and L2. The supercluster statistic gives a two per cent probability of finding a structure as regular as the BEKS data. Restricting to a length scale ~ 100-150 h^{-1} Mpc, however, the number of samples which show the kind of periodicity seen in the BEKS data is extremely small for each of these statistics. Overall, no sample is more regular than the BEKS data for all three statistics for a single period. The two popular CDM models we have studied here are apparently unsuccessful in reproducing the observed periodicity. From this result, together with the fact that the statistical results appear to be insensitive to the choice of the bias model, we conclude that CDM models conflict with the BEKS observation. Either the models need additional physics, or the data are a fluke or are somehow biased.
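As a cross-check, the significance levels quoted above can be recovered as simple ratios of the sample counts reported for each test. The block below assumes ensemble sizes of 1936 catalogues for the τCDM ensemble and 2625 for each of the two ΛCDM ensembles, the sizes quoted with the comb-template counts; each row lists the τCDM ratio followed by the two ΛCDM ratios, and the percentages are rounded.

```latex
\begin{align*}
\text{power spectrum } (X > X_{\rm BEKS}): \quad
  & 14/1936 \approx 0.723\%, & 7/2625 &\approx 0.267\%, & 13/2625 &\approx 0.495\% \\
\text{PG comb-template test}: \quad
  & 209/1936 \approx 10.8\%, & 134/2625 &\approx 5.10\%, & 127/2625 &\approx 4.84\% \\
\text{Rayleigh statistic}: \quad
  & 46/1936 \approx 2.38\%, & 66/2625 &\approx 2.51\%, & 56/2625 &\approx 2.13\%
\end{align*}
```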

Various possible physical explanations have been proposed, such as coherent peculiar velocities (Hill, Steinhardt and Turner 1991), oscillations in the Hubble parameter (Morikawa 1991) or baryonic features in the power spectrum (Eisenstein et al. 1998), but all of them seem to require either additional mechanisms with fine tuning beyond the standard theory or cosmological parameters significantly different from currently favoured values. Intriguingly, Dekel et al. (1992) demonstrated that built-in power on a large (~ 100 h^{-1} Mpc) length scale in the initial density fluctuations could indeed reproduce periodic features on a given scale, at least by some of the tests we have considered. If such excess power on large scales (hence still in the linear regime) exists, it will be detectable in the power spectra of the future 2dF and Sloan surveys.

Having found at least a few examples that are nearly as periodic as the BEKS data, we cannot rule out the possibility that the BEKS data (or the Galactic Pole direction) are a fluke. On the other hand, one should be aware of the complexity of the original observations: an incomplete compilation of a narrow and deep survey and of a wide and shallow survey at each of the Galactic Poles. It is not clear whether such a combination constitutes a fair sample. No evidence for periodic structure on ~ 130 h^{-1} Mpc has been found so far in two other deep redshift surveys, the ESO-Sculptor Survey (Bellanger and de Lapparent 1995) and the Caltech Faint Galaxy Redshift Survey (Cohen 1999). Follow-up observations to BEKS by Koo et al. (1993) did not show a strong regularity in two other directions, although around the Galactic Pole the regularity was found to be further strengthened. Our results give the a priori probability for such apparent periodicity in CDM models. Several more deep surveys might suffice to judge whether the discrepancy with BEKS reflects a major inconsistency. The planned VIRMOS Deep Survey (Le Fevre et al. 1998, see also Guzzo 1999) will survey the range 0.3 < z < 1 and will provide, together with the large-volume 2dF and Sloan surveys, much larger and more complete samples in the near future.

Figure 1. The shape of a light-cone and the observer point. The radius of the sphere is 1500 h^{-1} Mpc in both the τCDM and the ΛCDM models.

Figure 2. The two-point correlation functions of the biased 'galaxies'. The dash-dotted line is the 'galaxy-galaxy' correlation of our model bias t1, the solid line is for bias L1 and the dashed line is for bias L2. The open squares are the observational data from the APM survey (Baugh 1996) and the dotted line is the galaxy correlation function assumed by Szalay et al. (1991) for the analysis of the BEKS survey. The curves fit both the APM data and the Szalay et al. model well.

Figure 3. The distribution of the simulation dark-matter particles (left panels) and the biased 'galaxies' (right panels) in real space. The panels show 200 × 200 × 10 Mpc^3 slabs. Note the difference in the number density of the simulation dark-matter particles between the τCDM model (0.125 h^3 Mpc^{-3}) and the ΛCDM model (0.037 h^3 Mpc^{-3}). Strong bias effects are apparent in t1 and L2, whereas in L1 essentially no bias is seen.

Figure 4. The joint distribution of the overdensity fields of 'galaxies' and mass, both smoothed with an 8 h^{-1} Mpc top-hat window. The grey lines show the mean biasing function b(δ). Dashed lines show the linear relation δ_g = δ for reference. The measured biasing parameters \hat{b} and \tilde{b}/\hat{b} are given in each panel.

Figure 5. The BEKS data.

Figure 6. Selected samples. The horizontal axis of the distance histograms is R (Mpc/h).

(a) τCDM model t1: max{P(k)} = 0.145 at k = 8.5; Δ = 0.164; R = 0.574.
(b) τCDM model t1: max{P(k)} = 0.149 at k = 16; Δ = 0.167 with period 220 h^{-1} Mpc; R = 0.576.
(c) τCDM model t1: max{P(k)} = 0.201 at k = 2.5; Δ = 0.152 with period 830 h^{-1} Mpc; R = 0.505.
(d) ΛCDM model L1: max{P(k)} = 0.150 at k = 5.0; Δ = 0.159 with period 200 h^{-1} Mpc; R = 0.358 with period 200 h^{-1} Mpc.
(e) ΛCDM model L2.
(f) ΛCDM model L2: max{P(k)} = 0.176 at k = 5.0; Δ = 0.155; R = 0.572 with period 190 h^{-1} Mpc.

Figure 7. The cumulative (histogram) and the differential (dot-dashed line) distributions of the peak-to-noise ratio are plotted for τCDM bias t1 (top), ΛCDM bias L1 (middle) and bias L2 (bottom). Note the difference in the total number of samples in the two models. The arrow indicates the peak-to-noise ratio of the BEKS data.

Figure 8. The distribution of wavenumbers for the high peaks. The shaded histograms show the number counts of the peaks with X > X_BEKS and the unshaded histograms show those with X > 8.0. The arrow indicates the peak wavenumber of the BEKS data.

Figure 9. Δ-test: the number counts of selected samples with Δ < Δ(BEKS) are plotted against the measured period (see text). The period of the BEKS data is shown by an arrow at 130 h^{-1} Mpc.

Figure 10. The Rayleigh statistic R. The number counts of samples with R < R(BEKS) are plotted against the period assigned to each of the samples for the Rayleigh statistic. The arrow indicates the period for the BEKS data.